ABSTRACT

Although single case studies might be useful to
evaluators for a variety of purposes, there are no generally accepted
ways for drawing inferences about the generality of findings from a
case study. Single case studies are defined in this paper as either
studies of single events, or disaggregated studies of multiple
events. The data may be qualitative or quantitative, and may be
derived from controlled experiments or from observation. There are
two spans to the bridge of inference. The statistical span connects
the experimental sample to a population just like that sample. The
second span connects the population to a group judged to be
sufficiently similar. In case law or in clinical practice, the
judgment of sufficient similarity -- that is, the judgment of the
appropriateness of the generalization -- is made by the user. This
application of single case data may also be appropriate in
educational evaluation. (CTM)

Mary M. Kennedy

I chose the topic of generalizability because it seemed to me
that, although single case studies might be useful for a variety of
purposes, they presented difficulties to any evaluator who wished to
generalize his findings. There are no generally accepted ways for
drawing inferences about the generality of findings from a case study,
or even from studies of a very few cases. Advocates of single case
methodologies have used a variety of arguments to overcome the problems
of sampling limitations, but none has satisfied those who rely on
multiple cases to draw generalizable findings from studies. Some
authors (e.g., Feinberg, 1977) have erroneously assumed that by increasing
the number of data points on a single case, one has eliminated the problem.
But this is not a solution, for these several data points are still
based on only one subject. Other authors have argued the importance of
single-case methodologies because they can accommodate alternative
epistemological viewpoints (e.g., Rist, 1977; Stake, 1978). In these
discussions, arguments are made for the validity of qualitative data,
subjective impressions, or descriptions of naturally occurring events.
While these arguments may be valid, they suggest that the evaluator needs
the talent of Tolstoi to be able to describe these events in ways that
allow a reader to draw the appropriate inferences. Still other authors
(e.g., Edgar and Billingsley, 1974; Edgington, 1967) have argued for the
application of nonstatistical, but still logical, rules of generalization.
It is important to realize that nonstatistical arguments need not be
invalid. Yet many researchers may be timid about attempting such
inferences simply because the rules as to what constitutes a reasonably
sound inference are ambiguous, relative to the rules as to what
constitutes a sound statistical inference. What is needed are rules of
inference that reasonable people can agree on. If these rules were applied
and revised over time, it might be possible for us to be as comfortable
with such an inferential process as we are with the statistical
inferential process. In this paper, I will address myself to such a set
of rules, and will focus only on rules that apply when samples are
restricted. By doing so, I will not concern myself with a number of other
issues that can be raised regarding single-case methodologies, such as
the rules by which causal inferences are drawn, or the way in which data
are collected. In fact, there are a number of issues that are important,
but that I will not get into, so it might be well for me to start by
listing some of these.


First, I won't talk about quantitative versus qualitative
observations. Either kind of observation can be made in single case or
multiple case studies. Though the issue is an important one, it is
not necessarily related to the size of the sample.

Second, I will not address the relative values of controlled
experiments versus observation of naturally occurring events. Again,
either approach can be used on any sample size. I will, however,
assume in this discussion that there is a treatment of interest, and
that the goal of the study is to generate some form of generalizable
knowledge about that treatment.

Finally, I hope not to go into the problem of the appropriateness
of different units of analysis, except to the extent that the choice of
units may prove to be related to generalizability. It is just as
possible to select a wrong unit in a single-case study as in a
multiple-case study, although it may be easier in a single-case study to see
the fallacy of wrong reasoning when it comes time to generalize about

There is one topic related to the single versus multiple case
study, however, that I cannot ignore, and that is the subject of
aggregation. For purposes of this paper, I will define single case
studies as either (a) studies of single events, or (b) disaggregated
studies of multiple events. That is, it is possible to study more
than one case, but to study them individually, rather than averaging
or in other ways pooling the data across cases. The reason that I
make this distinction is that the disaggregated study of multiple
cases is not often considered as an alternative, and it may prove to
be a strong approach to evaluation. I will consider the
problem of generalization from either of these two types of single
case studies.
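
The difference between pooling data and studying cases individually can be illustrated with a brief sketch. The Python below is offered only as an illustration under stated assumptions: the six gain scores, the case labels, and the zero threshold are hypothetical and are not drawn from any study discussed in this paper.

```python
from statistics import mean

# Hypothetical gain scores for six cases (invented for illustration only).
gains = {
    "case_1": 12, "case_2": 10, "case_3": 11,
    "case_4": 13, "case_5": 9, "case_6": -1,
}

def pooled_summary(scores):
    """Aggregated view: a single average across all cases."""
    return mean(scores.values())

def disaggregated_summary(scores, threshold=0):
    """Disaggregated view: each case is judged on its own outcome."""
    return {
        case: ("gain" if score > threshold else "no gain")
        for case, score in scores.items()
    }

pooled = pooled_summary(gains)
by_case = disaggregated_summary(gains)
```

In this invented example the pooled average looks favorable, while the case-by-case view shows that one case did not gain at all. That exception is the kind of information that averaging conceals and that a disaggregated study keeps visible.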


The reason it is necessary to specify that these several topics
will not be discussed is that they often are discussed in debates
about the merits of single-case studies. It seems necessary in these
debates to describe the attributes that these studies could have that
are not attributes of some other types of studies. For my purposes, I
would like to confine myself solely to differences of sample size.
Thus, the comparison against which I will contrast single-case studies
can simply be labeled the "multiple-case study". Multiple-case studies
are popular because they allow the application of a variety of
statistical techniques to the data to determine the generalizability of
the findings. With proper sampling techniques, the variance of the sample
can be used to estimate the variance of the population, and it has even
been argued (e.g., McNemar, 1940) that the existence of human variation
forces psychological research to rely on sampling. Clearly a
single-case study cannot meet this criterion. And without an estimate of
population variability, no basis exists for statistical inference
about the population (Edgington, 1967).

This single weakness of single-case studies is sufficient to
discourage many researchers from employing them, in spite of
demonstrations of the influence single-case studies have had on theory
(Dukes, 1965), and a variety of other arguments as to their advantages
(Herson and Barlow, 1972; Edgar and Billingsley, 1974).

In this paper, I will attempt to explicate what some alternate
rules for generalization inferences might be. The rules that I will
offer are far from being complete or fully developed, but may offer
a base which can later be refined by practicing researchers.

The Problem of Generalization

Let me start by reminding you that an inference of generalization
is always tentative. That is, data might offer confirming or
disconfirming evidence, but never conclusive evidence. Not even in
multiple-case studies can the evaluator generate conclusive evidence of
generalizability. The strength of the evidence is a matter of judgment.
For that reason, a good evaluator tries to make clear the strength of his
evidence. A currently popular term for the generalizability of a
finding is external validity (Campbell and Stanley, 1963), but a more
appropriate term would be "strength of external validity", or "strength
of generalizability", since these terms suggest that generalization is
a judgment of degree, rather than a binary decision.

A second point about generalization is that it is not simply a
function of the number of units one has observed. More important are
the kinds of units observed, that is, the range of characteristics of
the units investigated and the range of conditions under which
observation occurred. The range of characteristics included in a sample
increases the range of population characteristics to which
generalization is possible. Thus, generalizations may vary in their range
as well as in the strength of their arguments. It should be clear, then,
that a wider range of generalization is not necessarily achieved by
increasing the sample size. Large samples may be selected for their
geographic convenience, their political accessibility, or other
irrelevant factors, and may consist of highly homogeneous groups. For
example, an investigation of treatment effect on 100 students in a
"college town" elementary school may have a narrower range of
generalization than a study of 10 children whose parents range in income
and education levels and whose homes are in rural, suburban, and urban
areas. Cornfield and Tukey (1956) described inference as having two
spans. One, a statistical span, connects the sample to a population just
like the sample. The second span connects to a population believed or
assumed to be sufficiently similar that the study findings apply there as
well. The second span cannot rely on statistics for assistance, but is
not necessarily less valid. I think when we consider single case studies,
we are forced to rely almost completely on the second span for our
inferences. And the rules for generalizing across that span are much
more ambiguous than are the statistical rules for generalizing across
the first span. I am not capable of providing such a set of rules, but
would like to take this opportunity to offer a starting point,
beginning with designs with disaggregated replications of single cases.


Generalizing from Replicated Single-Case Studies 


I begin with the problem of generalizing from replicated single
cases, primarily because these designs are easier to generalize from
than are designs with a single case alone. For purposes of this
discussion, however, let us assume that the number of replications is
still quite small -- somewhere between 2 and 10 -- and consider some
criteria that might be used to generalize from these cases. Criteria for
generalizing begin with the attributes of the sample cases.


Criteria for Sample Attributes

1. Wide Range of Attributes across the Sample Cases

I referred to this criterion earlier and mentioned that
it applied to group studies as well as single-case studies. What
is important here, though, is to recognize that it is possible to
encompass a wide range of attributes with only a few cases. Six
school districts can vary considerably in their size, their income, and
their community's educational levels. Six children can similarly
vary. But because of the possibility in these designs of confusing
idiosyncratic outcomes with generalizable outcomes, we need other
criteria to strengthen the generalization.


2. Many Common Attributes between Sample Case(s) and the
Population of Interest

To the extent that we can identify all those attributes that are
common between the sample and one or more populations of interest,
these common features may form a basis for generalization. To
determine the common attributes, the evaluator may need to define the
attributes of the population(s) of interest. This can be done in one of
three ways -- the evaluator may rely on prior knowledge of the
population attributes, he may make assumptions about its attributes, or he
may be able to define a hypothetical population of interest by its
attributes. Any of these three approaches is appropriate, and all are
probably equally feasible. I say this on the assumption that no
evaluation is done in a vacuum, but instead follows on a continuing
series of studies, each of which has contributed in some way to the
body of knowledge that the evaluator of a new project has at his
disposal.

Now there are obviously situations where an evaluator may want to
generalize but not to any particular population. That is, he merely
wants to have his findings apply as much as possible to other
situations. It would still be possible for such an evaluator to describe
several "common" features of his sample. For example, he can identify
all of the normative characteristics of his sample cases -- a child's
IQ or reading level, a school district's wealth or size -- and argue
that his findings generalize to "similar" cases. Now, the first
criterion, the range of attributes in the sample cases, increases the
range of generalization. This second criterion, the number of
attributes in common with the population, increases the strength of the
generalization. But, of course, not just any attributes will do. If
irrelevant attributes were employed, and dozens listed off, they would do
little to strengthen our case. I will come to a criterion of relevance
later on.

3. Few Unique Attributes in the Sample Case(s)

This criterion is actually the converse of the preceding. To the
extent that our sample has unique attributes, these attributes
interfere with generalizability. But unique attributes are more difficult
to isolate. They might include special circumstances, for example,
such as the recent death of a child's parents, or an unusual method
by which school board members are selected in a sample district.
Unique attributes may often be attributes that one would never think
to look for. Yet one must attempt to find these, for it is only by
separating the unique from the common features that the relationship
between a sample and a population can be defined. It would be
acceptable, I think, to define unique attributes post hoc. At least it
would be preferable to not defining unique attributes at all.

4. Relevance of Attributes

We have considered three criteria for sample attributes that are
necessary for generalizing. The first was the range of attributes included
in the sample; the second was the number of similarities between the
sample and the population(s) of interest; the third was the lack of
unique features. This fourth now follows on the first three and requires
that the attributes previously identified be relevant.

In educational studies, for example, attributes such as children's
hair color would not be relevant. Neither would attributes of schools
such as the brand of their furnaces. Determining which attributes are
relevant, then, requires some analysis of the treatment itself, or of the
hypotheses or study questions of interest. For starters, one would want
to list:
(a) those attributes which the treatment is designed to influence
    (e.g., a child's reading level, or a school's compliance with
    a federal regulation)
(b) those attributes known from prior experience to be related to
    the first set of attributes (e.g., the educational level of
    the child's parents; the resources available to the school)
(c) those attributes hypothesized by other researchers or evaluators
    to be related, but which lack substantial evidence of
    relationship.
Where sufficient prior knowledge is available, these attributes can
also be characterized by the strength of the evidence that they are
causally related, and by the nature of their apparent relationship to
the outcome of interest (e.g., necessary but not sufficient condition,
neither necessary nor sufficient, etc.).
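
The four criteria for sample attributes can be sketched as simple set operations. The Python below is a minimal illustration, not a procedure proposed in this paper. The attribute names and values are hypothetical, the population description is assumed to be known in advance, and the set of relevant attributes is assumed to come from a prior analysis of the treatment.

```python
# Hypothetical descriptions of three sample districts (illustration only).
sample_cases = [
    {"setting": "rural", "income": "low", "furnace_brand": "A"},
    {"setting": "urban", "income": "high", "furnace_brand": "A"},
    {"setting": "suburban", "income": "middle", "furnace_brand": "A",
     "board_selection": "lottery"},
]

# Assumed attributes of the population of interest (hypothetical).
population = {
    "setting": {"rural", "urban", "suburban"},
    "income": {"low", "middle", "high"},
    "furnace_brand": {"A", "B"},
}

# Assumed to result from an analysis of the treatment (hypothetical).
relevant = {"setting", "income"}

def attribute_range(cases):
    """Criterion 1: the values each attribute takes across the sample."""
    ranges = {}
    for case in cases:
        for name, value in case.items():
            ranges.setdefault(name, set()).add(value)
    return ranges

def common_and_unique(cases, population):
    """Criteria 2 and 3: attributes whose sample values all occur in the
    population description, versus attributes that do not."""
    ranges = attribute_range(cases)
    common = {
        name for name, values in ranges.items()
        if name in population and values <= population[name]
    }
    unique = set(ranges) - common
    return common, unique

ranges = attribute_range(sample_cases)
common, unique = common_and_unique(sample_cases, population)
relevant_common = common & relevant  # Criterion 4: keep only relevant ones
```

In this invented example the furnace brand is common to sample and population but drops out as irrelevant, while the unusual method of selecting the school board is flagged as unique. The sketch only organizes the comparison. Deciding what counts as relevant remains a matter of judgment.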

Notice that these criteria for relevance could only be useful
for common attributes, and not for unique ones. The relevance of unique
attributes cannot be determined on the basis of prior experience,
precisely because the feature is unique. The relevance of unique
attributes can only be determined by an analysis of their relationship to
the treatment during the course of the study. To see how this might
be done, let us now turn from arguments of generalization that are
based on the sample to those that are based on the treatment.

Criteria for Attributes of the Treatment
Characteristics of the treatment are rarely considered as useful
evidence for generalization, perhaps because, in multiple-case studies,
documentation of treatment characteristics would be too time-consuming
a task. But the single-case approach can document not only the
effects of the treatment but also the reasons for those effects. In
fact, a single-case approach forces the evaluator to analyze the
functional relationship between the treatment and the subject.

Example. Suppose an evaluator is assessing the effectiveness
of computer-assisted instruction (CAI) on only six students. He chooses
to disaggregate his sample, and finds that the CAI was highly
successful for five of the six students he had tried CAI with. Now, the
five successful cases may represent a range of academic abilities, may
have varied in their experience with automated devices, and may have
come from different classrooms. How, then, does the sixth student
differ? This example could have any number of endings. The sixth
student may have had a slightly lower ability than the lowest of the
other five, leading the evaluator to hypothesize that a minimal level
of competence is necessary to function well with the machine. Or he
may have had a previously undiscovered visual impairment which made
words on a lighted screen difficult for him to read. Conversely, the
evaluator might have discovered that each of the five "success" stories
is due to unique features of these five students. They may have had
a variety of emotional problems or family crises which were easier to
escape from during computerized instruction than during classroom
instruction. Any of these findings provides the evaluator with further
clues as to how the treatment functions with regard to recipients. Our
concern in this section, then, is to determine what the criteria are that
allow generalization from a functional analysis of the treatment.

Just as the sample attributes are defined with reliance on prior
knowledge, so are the treatment attributes. After all, the treatment
was not developed in a vacuum. It was developed to meet a preconceived
need for improvement of something, and it was developed on a hypothesis
that for some (specified) reason this particular treatment would in
fact accomplish the needed improvement. That is, the treatment was
expected to influence cases in some particular way. The assumptions and
goals of the treatment, then, provide a starting point for a functional
analysis of its real influence. There are three criteria relating to
the treatment attributes that can be used to increase either the
strength or the range of generalization about its influence.

1. Wide range of treatment attributes across replications. Let
us say that administration of the treatment was varied slightly (or
even greatly) across cases. That is, each case received a different
variation of the treatment. The nature and extent of these variations
may provide evidence of the treatment's utility across contexts, but
more importantly they may allow the evaluator to identify those
attributes of the treatment that appear to be functionally contributing to
the outcomes. That is, they may allow the evaluator to separate the
relevant from the irrelevant attributes of the treatment, so that
conclusions are based on more fundamental characteristics rather than
surface characteristics of the treatment. Now, to the extent that
surface characteristics are defined, the treatment is less malleable
and therefore less likely to be applicable in many different contexts. To
the extent that treatment variations allow more fundamental attributes
to be identified, then, they also allow conclusions about effects to
be stated in terms of the fundamental basis of the treatment and hence
increase the range of generalization.

2. Common patterns of treatment outcomes across sample cases.
The extent to which the treatment yields a similar set of outcomes or
non-outcomes across the different cases lends credence to the hypothesis
that the treatment did in fact have a predictable influence on the
cases, even if the influence was not what was hypothesized. This
criterion is further strengthened, of course, if the consistent
patterns are observed in a sample of cases with varying attributes. It
should be clear by now that the value of these criteria is not in
their individual merits but rather in their collective use. This
criterion is no exception. A replicated pattern of outcomes may do
more for one's causal inferences than for one's generalization
inferences, if it is not accompanied by some of these other criteria.

3. Common treatment functions across cases. This final
criterion for the treatment, its functional influence, refers to the
reason for the treatment effect. This criterion is our strongest
criterion, but it too relies on other criteria. The advantage of
the single-case methodology is that it forces the evaluator to look
at the functional relationships between the treatment and the subject(s).
Since the evaluator must assess the treatment's influence on each case
individually, he is more likely to discover the fundamental
characteristic of the treatment and the extent to which the nature of
treatment influence was related to sample or context attributes.
Ultimately, it would not be a treatment effect which was generalized --
it would be a relationship between treatment, context, and recipient that
was generalized. And since the treatment may be re-cast in its more
fundamental form, the conclusion may be applicable to a wider range of
contexts or subjects.

Example. Suppose an evaluator is asked to assess the effects of
a "trial" trip to the zoo on children's attitudes toward school. The
evaluator considers this to be a single-case study with the classroom
as the single unit receiving the treatment. He accompanies the teacher
and her children to the zoo, observes children and interviews them
about their attitudes. Every child he speaks to is delighted by the
trip, but different children are pleased for different reasons. Bobby,
for example, liked the trip because it was an escape from school, which
he hated. Mike, on the other hand, liked the trip because it helped
him understand his recent science lessons. Mike was anxious to return
to school and continue his science lessons with the aid of his
experience. The evaluator's functional analysis of the treatment allowed
him not only to determine the effects of the treatment on attitudes,
but also to determine the reasons for the effect. Further analysis
of his data led him to conclude that:

(a) The classroom was not the appropriate unit of
    analysis for two reasons: first, because the
    treatment was not functioning in the same way
    across children within the classroom, and
    second, because it was administratively possible
    to provide the treatment to individual children
    rather than to classroom units.

(b) The hypothesized outcome of the treatment, that is,
    to increase children's positive attitudes toward
    school, occurred only in a subset of children --
    those children who had never visited a zoo
    before. That is, it was the uniqueness of the
    experience that influenced the children.

The evaluator chose to generalize his findings, not only to all
children who had never visited a zoo before, but also to other kinds of
treatments, and concluded that:

(c) Excursions to places previously unfamiliar to the
    children, but which related to current curricular
    content, would probably produce similar attitudinal
    change.

Functional analysis allows the evaluator to redefine the
treatment in a more fundamental way, and to describe the treatment effect
in relationship to both context and recipient.


Generalizing from Single-Case

The foregoing has shown the advantages of replicated single-case
designs. Disaggregation allows a functional analysis of the treatment,
while replication allows an analysis of variability and commonality
of sample attributes. Clearly a study of a single case with no
replications limits the strength of generalization arguments considerably.
It does not preclude functional analysis, however, nor does it
preclude a description of the relevant common and unique attributes of
the case. But, in fact, the range of generalization simply cannot be
known to the evaluator.

That the range cannot be known, however, does not mean that a
range does not exist. I mentioned earlier that the range of
generalization was necessarily a matter of judgment, and I would now like
to suggest that for studies of single cases, the judgment should not be
made by the evaluator. Instead, it should be made by those individuals
who wish to apply the evaluation findings to their own situations.
That is, the evaluator should produce and share the information, but
the receivers of the information must determine whether it applies to
their own situation. Since the evaluator cannot know who his receivers
are, he must, of course, be quite specific both in his description of
the attributes of his case and in his description of the way in which
the treatment influenced this case.

Researchers and evaluators are not accustomed to the notion of
leaving generalization up to the practitioner, but it is not an
uncommon occurrence in other fields. In fact, some fields even specify
criteria for generalization. Two, in particular, are worth reviewing:
legal generalizations and clinical generalizations.

Legal Precedent

The term "case law" refers to that portion of the law that is
built up from specific cases, rather than from statutes. These
specific cases are resolved on the basis of statutes, but the
interpretations of statutes that are made in each case set precedents for
future cases. If decisions are described in terms of general ideas, these
ideas may become principles and take on a life of their own (Cardozo,
1921). Though these decisions may be stated with the intention that
they be generalized, it is the later court which must decide whether
in fact a particular decision should be applied to its own case. Thus,
it is the receiver of the information who determines its generalization
to a new situation. For that reason, the rules by which these judgments
of generalization are reached might be useful to the educational
decision-maker who needs to judge the generalizability of a
single-case study to his own situation.

How are these decisions made? The process is one of search and
comparison (Cardozo, 1921), in which the attributes of the current
case are compared with the attributes of a variety of other cases.
That case which is most analogous, that is, which has the most similar
attributes, is selected as the most relevant precedent. The process,
then, is similar to that which we have already described for
generalizing, and its validity hinges on the extent to which the
attributes compared are relevant. Legal tradition focuses primarily on
these attributes:
(a) the material facts of the earlier case (are they
    similar to the current case?)
(b) the appropriateness of the decision made in the case
    (would a similar decision here be consistent with
    one's sense of justice and social welfare?)
(c) the reason for the decision (the statute used to
    justify the decision, and/or the justice's
    re-formulation of the statute)
(d) the level of generality with which the decision was
    formulated (the extent to which an intent to
    generalize seems apparent)
Let's consider how each of these might be applied by an educational
decision-maker.

(a) Material Facts. In a court of law, material facts might
consist of motivations, actions or transactions. In education,
material facts may range from attributes of the recipients of treatments,
to the treatment, to the context in which the study was done. They
would also include the reasons for trying the treatment, i.e., the
anticipated outcomes. The judicial process recognizes that not all
material facts are relevant (just as we have also noticed) and
provides a further stipulation that those facts which were relied upon
in making the earlier decision are the most relevant facts. Suppose,
for example, that an educational decision-maker is interested in
improving children's attitudes toward school and he reviews the case
study previously described. The relevant material facts are those
used as a basis for the decision -- "children who have never visited
a zoo" and "congruence between subject matter of the excursion and
subject matter in the classroom".


(b) Appropriateness of the Decision. In a legal matter, as
in an educational matter, appropriateness is determined by one's own
value system. In law, the criterion is one's sense of justice or
social welfare. In education, the criterion might be the same, or it
could be one's sense of the purpose or goals of education. It might
seem odd to think that enough disparity could exist between the former
case and the current one that the earlier decision could be completely
inappropriate now, but this criterion is an important one, for it
recognizes the fluidity of our social fabric and the need to change
precedents (and educational treatments) as social values change.

(c) Reason for the decision. A court decision will necessarily
be based on a statute or a re-formulation of a statute and may
be justified by demonstrating the relationship between the material
facts in the case and the intent of the statute. This reasoning is
then used by the later court to determine whether the earlier case
provides a relevant precedent. In education, laws are not available
to base case-study conclusions on, but a functional analysis of the
treatment may provide a similar basis in the sense that it, too,
provides a justification for reaching a decision as to the effects
of the treatment. Suppose, for example, an educational decision-maker
is contemplating the use of computer-assisted instruction and reviews
the single-case study described earlier. Suppose, further,
that the earlier decision had been that CAI was not useful because
only those children who had emotional problems benefited from it. If
our current decision-maker happens to be at a school for delinquent
boys, he may determine that the reason for rejecting CAI earlier does
not hold for his current situation. Since many of his students have
emotional problems, CAI may be valuable in his situation.




(d) Generality of the Decision. A legal example of generality
has been shown by Gottlieb (1968) in a negligence case. A bottler was
considered negligent because a consumer had found a dead snail in a
bottle of beer. The judge could have argued negligence for that
specific reason, or because a dead creature was in a sealed bottle, or
because a foreign object was in a container. The last is clearly a
more general statement than either of the others. Similarly, our
evaluator of the effects of the excursion to the zoo chose to make a
general decision about the relationship between the curriculum, the
treatment, and the children's past experience, rather than simply
saying that trips to the zoo were beneficial when animals were being
studied and children had never been to a zoo.

These four criteria for generalizing from cases are clearly
judgmental -- they are designed to provide guidance to users of
information, not to those who generate it. In law, these two groups are
synonymous, for a given judge may be a precedent-setter in one case
and a precedent-user in another. The fact that he is sometimes a
user makes him conscious of the potential implications of his
decisions when he is a creator. Educational evaluators may be at a
disadvantage if they operate as precedent-setters and remain segregated
from the population of decision-makers who might wish to generalize
their decisions to other contexts.

Clinical treatments

Medical and clinical psychological professions serve individual
clients and much of their clinical knowledge develops from the
accumulation of findings about treatment effects on individuals. Individual
cases may be studied to learn more about etiology, more about
particular treatment effects, or because the cases are unique and need to be
carefully studied to determine whether in fact the patient has a new,
as yet unrecognized ailment. To the extent that the purpose of the
study is to facilitate classification, no inference is necessarily
involved. We will confine ourselves here to those cases where an
inference of generalizability is involved. Like generalizations in
law, clinical generalizations are the responsibility of the receiver
of information rather than the original generator of the information
and so the evaluator must be careful to provide sufficient
information to make such generalizations possible. Small and Krause (1972)
have identified three important criteria for inclusion in a clinical
report.

(a) Longitudinal information (both extensive case
history and extensive follow-up after treatment)

(b) Multi-disciplinary assessment of patients
(representation of a variety of specialties
and perspectives)

(c) Precision in description (rather than imprecise
or vague terminology)


Let us consider the application of these criteria to educational case
studies.

(a) Longitudinal information. In clinical research, longitudinal
data may provide information regarding the past (onset of
symptoms, frequency or nature of other illnesses, etc.) or the future
(recidivism, side effects, etc.). These findings place the client's
illness and treatment in the context of his overall health and well-being.
In education, longitudinal data provide a similar service in the sense
that the function of the treatment vis-a-vis the subject is understood
relative to the history and development of that subject or context. Such
information helps the user of the case study to estimate the similarity
between the evaluator's case and his current case.

(b) Multidisciplinary Assessment. In medical inquiry, the use
of a variety of specialists ensures that all aspects of the client's
health are considered and thereby diminishes the possibility of unwitting
mis-diagnosis or of confounding of different health problems. In education,
similar advantages may be obtained by use of educational psychologists,
sociologists, and developmental psychologists, each of whom may bring
their own perspectives and interpretations to bear on the case. Use of
several disciplines increases the breadth of information generated
regarding the case and, hence, helps the user to know the full range of
circumstances surrounding the case.

(c) Precision of description. Rather than simply describing a
client's ailment by its categorical label, full description of its
nature and degree communicates more precisely the client's situation.
An application of this principle to education might mean that, for
example, a child's reading ability would be described for oral versus
silent reading, reading of different kinds of material, interest in
reading, and so forth, rather than described simply by a
grade-equivalent score.

In both legal and clinical fields, then, we see that generalizations
are frequently necessary from single cases, but it is also clear
that these generalizations are done by the user of the case data rather
than by the person who originated the case data. And the generalization
is not from a case to a population but rather from a case to another case.
Because the generalization is from one case to another, the user must
rely on as much information as possible to determine the ways in which
the two cases are analogous.

This approach to generalization seems appropriate to the field of
education, for education is a highly decentralized industry with great
variability among different practicing agencies. Though the evaluator
may not care for the ambiguity of this situation, and may feel obligated
to define his range of generalization precisely, he must not forget that
regardless of his care, users will ultimately make their own decisions
as to whether his findings are applicable in their situations. In fact,
the user will probably, like the judge in court, study an array of
available examples and pick the one which most closely approximates his own
situation. To the extent that this is so, single-case studies will prove
more valuable to educational decision-makers than group studies, since
the latter may not allow generalization to individual cases.


Summary 

The importance of generating some set of rules for non-statistical
inferences should not be underestimated. Part of the appeal of grouped
studies is that well-defined statistical rules have been developed to aid
our inferences. The clarity and specificity of these rules make it relatively
easy for different people to agree on the generalizability of a study. It
is no trivial matter to embark on a new course for which the rules of
inference have not been established. For that reason, I have tried to
offer a starting point for the development of an alternate set of rules.
Surely these can be improved and other rules can be added, and just as
surely, we are a long way from approaching the sophistication that
our statistical rules have realized.

An important topic which I have not addressed is the relative 
strength of these different rules, and I have avoided that issue 
intentionally for I think their relative utility may depend on the 
subject being investigated and the available knowledge and theory
about that subject. 

The rules suggested here may appear to require a larger dose
of subjective judgment than statistical rules do, but this may not be
the case in practice. For in conducting a multiple-case study,
judgments are made as to how to sample, how to design the study, and what
type of statistical analyses to employ. Furthermore, since samples
are rarely drawn randomly from the population to which inferences are
desired, judgments are used for the second span of Cornfield and Tukey's
(1956) two-span bridge of inference.

Judgments must be similarly employed in single-case studies, but
they need not be whimsical. Though educational program evaluation as
a field has existed for little more than a decade, a considerable body
of theory and knowledge exists that can provide a basis for these
judgments.